Intrusion Detection Systems Using Adaptive Regression Splines

Author: Abraham Ajith
Mukkamala Srinivas
Ramos Vitorino
Sung Andrew H.
Publication date: 01/01/2003

The past few years have witnessed a growing recognition of intelligent techniques for the construction of efficient and reliable intrusion detection systems. Due to increasing incidents of cyber attacks, building effective intrusion detection systems (IDS) is essential for protecting information systems security, and yet it remains an elusive goal and a great challenge. In this paper, we report a performance comparison of Multivariate Adaptive Regression Splines (MARS), neural networks and support vector machines. The MARS procedure builds flexible regression models by fitting separate splines to distinct intervals of the predictor variables. A brief comparison of different neural network learning algorithms is also given.
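The idea of fitting separate splines to distinct intervals of a predictor can be illustrated with the mirrored hinge functions commonly used as MARS basis functions. The following is only a minimal sketch: the function names, the single knot, and the coefficients are hypothetical and are not taken from the paper, and no model fitting is performed.

```python
def hinge_pair(x, knot):
    """Return the mirrored hinge values (max(0, x - knot), max(0, knot - x))."""
    return max(0.0, x - knot), max(0.0, knot - x)


def mars_style_predict(x, intercept, knot, coef_right, coef_left):
    """Evaluate a one-knot piecewise-linear model built from a hinge pair."""
    right, left = hinge_pair(x, knot)
    return intercept + coef_right * right + coef_left * left


# Hypothetical coefficients (assumed for illustration only).
# Right of the knot the slope in x is coef_right = 2.0; left of the knot
# it is -coef_left = -0.5, because coef_left multiplies (knot - x).
params = dict(intercept=1.0, knot=3.0, coef_right=2.0, coef_left=0.5)
predictions = [mars_style_predict(x, **params) for x in (1.0, 3.0, 5.0)]
print(predictions)
```

Each hinge is zero on one side of the knot, so the two coefficients control the two intervals independently. A full MARS procedure would additionally search for knots and prune terms, which this sketch does not attempt.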

arXiv.org e-Print Archive

CiteSeerX

Malware Analysis on Android Using Supervised Machine Learning Techniques

Author: Rana Md Shohel
Sung Andrew H.
Publication venue: The Aquila Digital Community
Publication date: 01/10/2018

In recent years, the growth of malware has prompted extensive research on malware analysis and detection for Android devices. Android, a mobile operating system with more than one billion active users and a high market impact, has become an attractive target for cybercriminals developing malware. Android implements a distinct architecture and security controls to address the problems caused by malware, such as a unique user ID (UID) for each application, system permissions, and its distribution platform, Google Play. There are nevertheless numerous ways to circumvent these protections, and creating new defenses becomes more complex as cybercriminals improve their malware development skills. A community of developers and researchers has been developing alternatives aimed at improving the level of safety; numerous machine learning algorithms have already been proposed or applied to classify or cluster malware, along with analysis techniques, frameworks, sandboxes, and systems security measures. One of the most promising techniques is the implementation of artificial intelligence solutions for malware analysis. In this paper, we evaluate numerous supervised machine learning algorithms by implementing a static analysis framework to make predictions for detecting malware on Android.

Aquila Digital Community

Improving Database Quality through Eliminating Duplicate Records

Author: Cather Martha E.
Sung Andrew H.
Wei Mingzhen
Publication venue: Scholars' Mine
Publication date: 01/11/2006

Redundant or duplicate data are the most troublesome problem in database management and applications. Approximate field matching is the key to resolving the problem by identifying semantically equivalent string values in syntactically different representations. This paper considers token-based solutions and proposes a general field matching framework to generalize the field matching problem in different domains. By introducing the concept of String Matching Points (SMP) in string comparison, string matching accuracy and efficiency are improved compared with other commonly applied field matching algorithms. The paper discusses the development of field matching algorithms from the developed general framework. The framework and corresponding algorithm are tested on a public data set of the NASA publication abstract database. The approach can be applied to address similar problems in other databases.
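To give a rough sense of what token-based field matching means, the sketch below compares two field values by the Jaccard overlap of their word tokens. This is a generic, swapped-in technique chosen for illustration; it is not the String Matching Points method, whose details the abstract does not give. The function names and sample strings are assumptions.

```python
import re


def tokens(value):
    """Lowercase a field value and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def token_jaccard(a, b):
    """Jaccard similarity of the token sets of two field values (0.0 to 1.0)."""
    tokens_a, tokens_b = tokens(a), tokens(b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # two empty fields: treat as no evidence of a match
    return len(tokens_a & tokens_b) / len(union)


# Same tokens in a different order and with different punctuation.
same_author = token_jaccard("Sung, Andrew H.", "Andrew H. Sung")
# Two of four distinct tokens are shared.
partial_match = token_jaccard("New Mexico Tech", "New Mexico Institute")
print(same_author, partial_match)
```

Because token sets ignore order and punctuation, syntactically different spellings of the same value can still score as a match. A threshold for declaring duplicates would have to be tuned per dataset.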

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Quasi-phase-matched Faraday rotation in semiconductor waveguides with a magnetooptic cladding for monolithically integrated optical isolators

Author: Block Andrew D.
Dulal Prabesh
Holmes B.M.
Hutchings D.C.
Seaton Nicholas C. A.
Stadler Bethanie J. H.
Sung Sang-Yeob
Zhang Cui
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2013

Strategies are developed for obtaining nonreciprocal polarization mode conversion, also known as Faraday rotation, in waveguides in a format consistent with silicon-on-insulator or III–V semiconductor photonic integrated circuits. Fabrication techniques are developed using liftoff lithography and sputtering to obtain garnet segments as upper claddings, which have an evanescent wave interaction with the guided light. A mode solver approach is used to determine the modal Stokes parameters for such structures, and design considerations indicate that quasi-phase-matched Faraday rotation for optical isolator applications could be obtained with devices on the millimeter length scale

Crossref

Enlighten

Enhancing Machine Learning Performance with Continuous In-Session Ground Truth Scores: Pilot Study on Objective Skeletal Muscle Pain Intensity Prediction

Author: Faremi Boluwatife E.
Oliveira Nuno
Stavres Jonathon
Sung Andrew H.
Zhou Zhaoxian
Publication date: 01/08/2023

Machine learning (ML) models trained on subjective self-report scores struggle to classify pain objectively and accurately because of the significant variance between real-time pain experiences and the scores recorded afterwards. This study developed two devices for acquiring real-time, continuous in-session pain scores and gathering ANS-modulated electrodermal activity (EDA). The experiment recruited N = 24 subjects who underwent a post-exercise circulatory occlusion (PECO) with stretch, inducing discomfort. Subject data were stored in a custom pain platform, facilitating extraction of time-domain EDA features and in-session ground truth scores. Moreover, post-experiment visual analog scale (VAS) scores were collected from each subject. Machine learning models, namely Multi-layer Perceptron (MLP) and Random Forest (RF), were trained using corresponding objective EDA features combined with in-session scores and post-session scores, respectively. Over a 10-fold cross-validation, the macro-averaged geometric mean score revealed that MLP and RF models trained with objective EDA features and in-session scores achieved superior performance (75.9% and 78.3%, respectively) compared to models trained with post-session scores (70.3% and 74.6%, respectively). This pioneering study demonstrates that using continuous in-session ground truth scores significantly enhances ML performance in pain intensity characterization, overcoming ground truth sparsity-related issues, data imbalance, and high variance. This study informs future objective-based ML pain system training. Comment: 18 pages, 2-page appendix, 7 figures
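For readers unfamiliar with the metric, a macro-averaged geometric mean score can be computed as sketched below. The sketch assumes one common definition: for each class, take the square root of one-vs-rest sensitivity times specificity, then average over classes. The abstract does not state which variant the authors used, and the function name and toy labels are hypothetical.

```python
import math


def macro_gmean(y_true, y_pred):
    """Average over classes of sqrt(sensitivity * specificity), one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        tn = sum(1 for t, p in pairs if t != label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        scores.append(math.sqrt(sensitivity * specificity))
    return sum(scores) / len(scores)


# Toy labels, assumed for illustration only (not data from the study).
y_true = ["low", "low", "high", "high"]
y_pred = ["low", "high", "high", "high"]
score = macro_gmean(y_true, y_pred)
print(round(score, 4))
```

The geometric mean drops toward zero when a class is rarely recognized, which makes it less forgiving than plain accuracy on imbalanced data.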

arXiv.org e-Print Archive

Supervised Learning-Based tagSNP Selection for Genome-Wide Disease Classifications

Author: Chen Zhongxue
Huang Xudong
Liu Qingzhong
Sung Andrew H
Yang Jack
Yang Mary Qu
Publication venue: Springer Science and Business Media LLC
Publication date: 26/04/2011

Background: Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. Results: We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. Conclusions: We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions

Harvard University - DASH

Smartphone Sensor-Based Activity Recognition by Using Machine Learning and Deep Learning Algorithms

Author: Andrew Sung H.
Liu Qingzhong
Mengyu Qiao
Prathyusha Uduthalapally
Sarbagya Shakya Ratna
Zhaoxian Zhou
Publication venue: EJournal Publishing
Publication date: 01/01/2018

Article originally published in International Journal of Machine Learning and Computing. Smartphones are widely used today, and it has become possible to detect changes in the user's environment by using smartphone sensors, as demonstrated in this paper, where we propose a method to identify human activities with reasonably high accuracy by using smartphone sensor data. The raw smartphone sensor data are collected from two categories of human activity: motion-based, e.g., walking and running; and phone movement-based, e.g., left-right, up-down, clockwise and counterclockwise movement. First, two types of features are extracted from the raw sensor data, and activity recognition is analyzed using machine learning classification models based on these features. Second, the activity recognition performance is analyzed through a Convolutional Neural Network (CNN) model using only the raw data. Our experiments show substantial improvement in the results with the addition of features and the use of the CNN model based on smartphone sensor data with judicious learning techniques and good feature designs.

Scholarly Works @ SHSU (Sam Houston State University)

Supervised learning-based tagSNP selection for genome-wide disease classifications

Author: Chen Zhongxue
Huang Xudong
Liu Qingzhong
Sung Andrew H
Yang Jack
Yang Mary Qu
Publication venue: BioMed Central
Publication date: 25/07/2007

The article was originally published by BMC Genomics. doi:10.1186/1471-2164-9-S1-S6

Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers.

Research support received from ICASA (Institute for Complex Additive Systems Analysis, a division of New Mexico Tech) and the Radiology Department of Brigham and Women's Hospital (BWH) is gratefully acknowledged. The authors highly appreciate Dr. Liang at SUNY-Buffalo for her invaluable help and insightful discussion during this study and Ms. Kim Lawson at BWH Radiology Department for her manuscript editing and very constructive comments.

Keywords: Supervised Recursive Feature Additions; Support Vector based Recursive Feature Addition; complex disease; genetics; disease prediction

Crossref

Scholarly Works @ SHSU (Sam Houston State University)

Springer - Publisher Connector

PubMed Central

Influence of Machine Learning vs. Ranking Algorithm on the Critical Dimension

Author: Liu Qingzhong
Sung Andrew H.
Suryakumar Divya
Publication venue: International Journal of Future Computer and Communication
Publication date: 01/06/2013

Article originally published in International Journal of Future Computer and Communication. The critical dimension is the minimum number of features required for a learning machine to perform with “high” accuracy, which for a specific dataset is dependent upon the learning machine and the ranking algorithm. Discovering the critical dimension, if one exists for a dataset, can help to reduce the feature size while maintaining the learning machine’s performance. It is important to understand the influence of learning machines and ranking algorithms on critical dimension to reduce the feature size effectively. In this paper we experiment with three ranking algorithms and three learning machines on several datasets to study their combined effect on the critical dimension. Results show the ranking algorithm has greater influence on the critical dimension than the learning machine. ICASA (Institute for Complex Additive Systems Analysis) of New Mexico Tech and the National Institute of Justice, U.S. Department of Justice (Award No. 2010-DN-BX-K223
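One simple reading of the critical dimension defined above is the smallest number of top-ranked features at which accuracy first reaches a chosen threshold. The sketch below follows that reading only. The threshold, the accuracy values, and the function name are hypothetical, and the paper may apply a stricter criterion than this.

```python
def critical_dimension(accuracy_by_k, threshold):
    """Return the smallest k whose accuracy meets the threshold, else None.

    accuracy_by_k[i] is the accuracy obtained with the top (i + 1) ranked
    features, as produced by some ranking algorithm and learning machine.
    """
    for k, accuracy in enumerate(accuracy_by_k, start=1):
        if accuracy >= threshold:
            return k
    return None  # no critical dimension exists at this threshold


# Hypothetical accuracies for the top 1..5 features (not results from the paper).
accuracies = [0.61, 0.74, 0.91, 0.93, 0.92]
k_star = critical_dimension(accuracies, threshold=0.90)
print(k_star)
```

Rerunning the same function on accuracy curves produced by different ranking algorithms or learning machines would show how each choice shifts the result.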

Scholarly Works @ SHSU (Sam Houston State University)

Feature Selection and Classification of MAQC-II Breast Cancer and Multiple Myeloma Microarray Gene Expression Data

Author: Andrew H. Sung
Jianzhong Liu
Qingzhong Liu
Raya Khanin
Xudong Huang
Youping Deng
Zhongxue Chen
Publication venue: Public Library of Science
Publication date: 11/12/2009

Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods

Public Library of Science (PLOS)

Aquila Digital Community

Crossref

Scholarly Works @ SHSU (Sam Houston State University)

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

DigitalCommons@Florida International University